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We  develop  a  mathematical  model  to  quantitatively  analyze  a  scalable  region-based 
hierarchical  group  key  management  protocol  integrated  with  intrusion  detection  to  deal 
with  both  outsider  and  insider  security  attacks  for  group  communication  systems  (GCSs) 
in  mobile  ad  hoc  networks  (MANETs).  Our  proposed  adaptive  intrusion  detection  technique 
is  based  on  majority  voting  by  nodes  in  a  geographical  region  to  cope  with  collusion 
of  compromised  nodes,  with  each  node  preloaded  with  anomaly-based  or  misuse-based 
intrusion  detection  techniques  to  diagnose  compromised  nodes  in  the  same  region.  When 
given  a  set  of  parameter  values  characterizing  operational  and  environmental  conditions, 
we  identify  the  optimal  intrusion  detection  rate  and  the  optimal  regional  area  size  under 
which  the  mean  time  to  security  failure  of  the  system  is  maximized  and/or  the  total 
communication  cost  is  minimized  for  GCSs  in  MANET  environments.  The  tradeoff  analysis 
in  performance  versus  security  is  useful  in  identifying  and  dynamically  applying  optimal 
settings  to  maximize  the  system  lifetime  for  scalable  mobile  group  applications  while 
satisfying  application-specific  performance  requirements. 
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1.  Introduction 

Many  mobile  applications  in  wireless  networks  such  as  military  battlefields,  emergency  response,  mobile  commerce, 
online  gaming,  and  collaborative  work  are  based  on  the  notion  of  group  communications.  Designing  security  protocols 
for  secure  group  communication  systems  (GCSs)  in  mobile  ad  hoc  networks  (MANETs)  faces  many  technical  challenges 
including  resource-constrained  environments  (e.g.,  bandwidth,  memory  size,  battery  life,  and  computational  power), 
openness  to  eavesdropping  and  security  threats,  unreliable  communication,  no  infrastructure  support,  and  rapid  changes  in 
network  topology  due  to  user  mobility  which  could  cause  group  merge/partition  events  to  occur  dynamically  [1]. 

In  this  paper,  we  are  concerned  with  dynamic  GCSs  in  MANETs  where  mobile  nodes  cooperate  in  a  group  setting  to 
accomplish  assigned  mission  tasks  as  in  military  battlefield  situations.  A  GCS  initially  may  contain  one  mobile  group.  Later 
a  mobile  group  may  be  split  into  two  because  of  node  mobility  and  failure.  Two  mobile  groups  may  merge  into  one  when 
connectivity  is  resumed.  Our  notion  of  a  mobile  group  is  connectivity-oriented,  i.e.,  nodes  are  in  a  mobile  group  as  long  as 
they  are  connected. 

The  GCS  is  mission-oriented  in  that  all  mobile  groups  regardless  of  their  size  will  execute  the  assigned  mission  throughout 
their  lifetime.  Thus,  the  primary  goal  of  a  mission-oriented  GCS  is  to  prolong  its  system  lifetime  in  the  presence  of 
security  attacks  to  increase  the  success  probability  of  mission  execution.  The  secondary  goal  is  to  satisfy  application-specific 
performance  requirements  in  terms  of  timeliness,  throughput,  delay,  or  traffic  capacity.  This  paper  aims  to  solve  the  problem 
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Fig.  1.  Integration  of  region-based  group  key  management  for  dealing  with  outsider  attacks  with  intrusion  detection  for  dealing  with  insider  attacks  for 
a  GCS  in  MANETs. 

of  prolonging  the  system  lifetime  of  such  a  GCS  in  the  presence  of  security  attacks  while  satisfying  system  performance 
requirements. 

To  deal  with  outsider  attacks,  a  cost-effective  way  is  to  use  a  symmetric  key,  called  the  group  key,  shared  by  group 
members.  The  group  key  is  employed  to  encrypt  messages  sent  by  a  member  to  the  group.  Only  members  of  the  group 
with  the  group  key  are  capable  of  decrypting  the  messages.  Thus,  the  group  key  protects  group  communication  information 
shared  by  legitimate  members.  Since  there  is  no  infrastructure  support  in  MANETs,  key  management  must  be  performed 
in  a  fully  distributed  manner.  This  creates  extra  system  overheads  whenever  the  group  key  is  “rekeyed”  because  of  a  group 
member  leave/join/eviction  event.  For  scalability  and  performance  reasons,  we  advocate  the  use  of  region-based  hierarchical 
group  key  management  [2]  for  generating  and  distributing  the  group  key. 

To  deal  with  insider  attacks,  intrusion  detection  system  (IDS)  techniques  [3]  have  been  proposed.  IDS  can  be  employed 
to  detect  compromised  nodes  and  to  evict  such  compromised  nodes  with  the  goal  of  high  survivability  and  availability  to 
prolong  the  system  lifetime.  In  MANETs,  distributed  IDS  must  be  employed  because  it  is  not  practical  to  use  a  centralized 
IDS  server  which  would  be  a  single  point  of  failure.  We  advocate  the  use  of  voting-based  IDS  [4]  to  cope  with  collusion  of 
compromised  nodes  for  survivability. 

In  the  literature,  group  key  management  and  IDS  techniques  for  MANETs  have  been  studied  separately  to  deal  with 
outside  attackers  and  inside  attackers,  respectively.  However,  both  are  needed  in  a  GCS  to  ensure  high  system  survivability 
and  security.  As  a  result,  group  key  management  and  IDS  are  often  deployed  separately,  resulting  in  high  management 
overhead  and  degraded  performance.  In  this  paper,  we  aim  to  demonstrate  the  advantages  of  integrated  management  over 
separate  management.  Specifically,  we  propose  to  integrate  region-based  hierarchical  group  key  management  [2]  with  a 
new  region-voting-based  IDS  scheme  to  result  in  a  highly  survivable  and  efficient  GCS.  We  use  the  mean  time  to  security 
failure  ( MTTSF )  as  the  security  metric  to  measure  the  lifetime  of  a  GCS  system  under  security  attacks.  We  use  the  control 
message  traffic  generated  for  secure  group  communication  as  the  performance  metric.  We  analyze  the  tradeoff  of  security 
versus  performance  properties  of  a  GCS  system  by  means  of  a  mathematical  model  and  identify  optimal  settings  such  that 
the  system  MTTSF  is  maximized  while  the  total  control  message  traffic  incurred  is  minimized  in  the  GCS.  Fig.  1  shows  the 
structure  of  our  integrated  protocol  to  deal  with  both  outsider  and  insider  attacks  while  satisfying  security  and  performance 
requirements  for  GCSs  in  MANETs.  We  shall  demonstrate  that  integrated  management  yields  higher  MTTSF  while  producing 
lower  control  message  traffic  compared  with  separate  deployment  of  key  management  and  IDS  techniques  in  the  literature. 

The  contributions  of  the  paper  are  as  follows.  First,  we  consider  an  integrated  design  to  cope  with  both  outsider  and 
insider  security  attacks,  i.e.,  region-based  group  key  management  for  dealing  with  outsider  attacks  and  region-voting- 
based  IDS  for  dealing  with  insider  attacks,  with  the  goal  to  achieve  secure  GCSs  in  MANETs.  This  is  the  first  work  that 
investigates  the  interplay  between  group  key  management  and  IDS  and  analyzes  their  interwoven  effect  of  how  one  affects 
the  optimal  setting  of  the  other  for  maximizing  the  mean  lifetime  of  the  system.  Second,  we  analyze  the  tradeoff  of  security 
versus  performance  properties  of  GCSs  as  a  result  of  integrating  IDS  with  hierarchical  group  key  management  for  scalability, 
survivability,  and  efficiency.  Third,  we  introduce  region-voting-based  IDS  to  cope  with  collusion  of  compromised  nodes 
to  prolong  the  system  MTTSF  with  a  randomness  design  in  the  vote-participants  and  region-leader  selection  process  to 
mitigate  the  problem  of  a  single  point  of  attack/failure.  Lastly,  we  develop  a  mathematical  model  based  on  stochastic  Petri 
nets  (SPN)  [5]  to  quantitatively  identify  optimal  settings  in  terms  of  the  optimal  regional  area  size  and  the  intrusion  detection 
interval  that  would  maximize  the  system  lifetime  (in  terms  of  MTTSF )  while  satisfying  performance  requirements  (i.e.,  total 
communication  cost  per  time  unit).  The  SPN  model  developed  is  capable  of  dealing  with  a  large  number  of  states  and  the 
underlying  semi-Markov  chain  generated  can  accommodate  any  general  time  distribution  for  modeling  the  event  occurrence 
duration,  thus  allowing  periodic  events  and  non-exponential-time  events  to  be  easily  analyzed.  A  GCS  can  monitor  the 
operational  and  environmental  conditions  characterized  by  a  set  of  model  parameter  values  at  runtime  and  apply  the 
optimal  settings  identified  in  this  paper  at  static  time  by  means  of  a  table  look-up  operation  so  as  to  dynamically  maximize 
the  system  MTTSF  and  satisfy  imposed  performance  requirements. 
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Table  1 

New  contributions  to  region-based  hierarchical  group  key  management. 


Region-based  group  key  management  [2] 

This  work 

Insider  attacks 

No 

Yes 

Outsider  attacks 

Yes 

Yes 

Optimization  design  tradeoffs 

Performance  tradeoff 

Performance  versus  security  tradeoffs 

Optimal  settings 

Optimal  regional  area  size 

Optimal  regional  area  size  and  optimal 
intrusion  detection  interval 

Metrics 

Overall  communication  cost 

Overall  communication  cost,  MTTSF 

Integration  of  group  key  management  with  IDS 

No 

Yes 

Events  considered 

Join,  leave,  group  merge,  group  partition, 
mobility,  and  beaconing 

Join,  leave,  group  merge,  group  partition, 
mobility,  beaconing,  group  communication, 
intrusion  detection,  and  node  eviction 

As  this  paper  integrates  region-based  hierarchical  group  key  management  [2]  with  region-voting-based  IDS  to  deal  with 
both  outsider  and  insider  attacks,  we  summarize  this  paper’s  new  contributions  with  respect  to  our  prior  work  in  region- 
based  hierarchical  group  key  management  in  Table  1. 

The  rest  of  this  paper  is  organized  as  follows.  Section  2  surveys  related  work  in  dealing  with  insider  and  outsider  attacks 
for  secure  GCSs  in  MANETs.  Section  3  describes  our  proposed  integrated  hierarchical  group  key  management  and  region- 
voting-based  IDS  protocol.  Section  4  describes  the  system  model  covering  the  assumptions  used,  security  failure  conditions 
defined,  performance  and  security  metrics  used,  and  attack  model.  Section  5  develops  a  mathematical  model  based  on  SPN 
and  explains  how  we  model  group  merge/partition  events  and  how  we  calculate  security  and  performance  metrics  from  the 
mathematical  model  developed.  Section  6  shows  analytical  results  obtained  with  physical  interpretations  given.  Section  7 
discusses  the  applicability  of  the  proposed  integrated  protocol  and  summarizes  the  paper  with  future  work  outlined. 

2.  Related  work 

Several  hierarchical  group  key  management  protocols  have  been  proposed  in  the  literature  for  scalability  reasons. 
1GKMP  [7,6]  divides  a  group  into  several  subgroups  to  enhance  scalability  and  applies  several  rekeying  algorithms  to  preserve 
secrecy  properties  as  members  move  within  the  hierarchy.  However,  it  is  not  suitable  for  MANETs  where  nodes  are  mobile 
and  wireless  communication  is  often  unreliable.  Rafaeli  et  al.  [8]  proposed  Hydra,  Dondeti  et  al.  [9]  proposed  DEP,  and 
Mittra  et  al.  [10]  proposed  Iolus  for  secure  multicasting  communication.  However,  all  are  based  on  the  use  of  static  subgroup 
controllers  in  the  system  and  are  also  not  suitable  for  MANETs.  In  [  1 1  ],  HKT  was  proposed  to  balance  security  and  efficiency, 
making  use  of  a  two-level  hybrid  key  tree  based  on  clusters.  Cluster  sizes  are  adjusted  depending  on  the  level  of  collusion 
resistance.  However,  HKT  is  designed  for  wired  networks.  Hierarchical  group  key  management  protocols  developed  for 
MANETs  include  [12-19],  Rhee  et  al.  [18]  employed  a  two-layer  hierarchical  key  management  structure  for  secure  group 
communications  for  unmanned  aerial  vehicles  (UAVs).  They  considered  the  use  of  a  stationary  super-node  as  a  cluster 
head.  Bechler  et  al.  [15]  proposed  an  efficient  distributed  key  management  based  on  hierarchical  clustering.  However,  no 
optimal  setting  was  identified  to  maximize  system  performance.  Lazos  et  al.  [16]  considered  a  hierarchical  key  management 
structure  for  energy-aware  secure  multicast  group  communication  in  MANETs  based  on  geographic  routing.  They  assumed 
a  fixed  cluster  size  without  identifying  the  optimal  cluster  size  to  maximize  system  performance.  Furthermore,  they  only 
considered  group  join/leave  events  without  considering  group  partition/merge  events  due  to  node  mobility.  In  [  1 2, 1 3, 1 7,20], 
the  optimal  cluster  or  group  size  to  minimize  rekeying  cost  was  studied.  The  analysis,  however,  is  not  for  mission-oriented 
GCSs  in  MANETs. 

Very  recently,  we  investigated  a  region-based  group  key  management  protocol  [2]  to  improve  the  system  performance 
for  group  key  generation  and  distribution  for  GCSs  in  MANETs.  Their  protocol  breaks  the  operational  area  into  regions  to 
reduce  the  group  key  management  overhead  and  to  make  the  protocol  scalable  to  a  large  number  of  nodes  in  a  group.  An 
optimal  regional  area  size  was  identified  based  on  the  tradeoff  between  inter-regional  and  intra-regional  communication 
cost.  However,  it  deals  with  outsider  attacks  only  with  the  focus  on  communication  cost  minimization.  This  paper  extends  [2] 
by  integrating  the  hierarchical  group  key  management  protocol  with  region-voting-based  IDS  in  order  to  deal  with  both 
insider  and  outsider  attacks,  with  the  goal  to  achieve  not  only  high  scalability  and  efficiency,  but  also  high  survivability  for 
GCSs  in  MANETs. 

Recent  research  efforts  on  IDS  protocols  developed  for  MANETs  focus  on  cluster-based  IDS  for  scalability.  A  major  issue 
is  that  the  cluster  head  is  a  single  point  of  failure.  The  main  idea  behind  cluster-based  IDS  is  that  instead  of  performing 
host-based  IDS  at  each  node,  a  cluster  head  (CH)  is  selected  to  collect  security-related  information  from  nodes  in  a  cluster 
and  determines  if  intrusion  has  occurred.  Non-overlapping  zone-based  IDS  was  proposed  in  [21,22]  for  MANETs  and  proven 
to  be  effective  in  intrusion  detection.  Marti  et  al.  [23]  developed  a  watchdog  mechanism  for  identifying  misbehaving  nodes 
based  on  dynamic  behaviors  and  developed  a  pathrater  algorithm  for  routing  around  misbehaving  nodes  for  MANETs.  Debar 
etal.  [24]  suggested  aggregation  and  correlation  of  IDS  alerts  to  reduce  communication/computational  overhead  caused  by 
performing  IDS.  Hierarchical  IDS  was  proposed  in  [25-27]  to  realize  distributed  anomaly-based  IDS  in  MANETs.  However, 
the  issues  of  extra  latency  and  energy  consumption  are  not  addressed.  The  assumptions  that  the  CH  is  tamper-resistant  and 
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the  CH  selection  process  will  not  be  interrupted  by  attackers  are  also  questionable.  We  advocate  region-voting-based  IDS 
by  which  multiple  nodes  in  a  region  participate  in  the  eviction  process  of  a  target  node  and  perform  a  distributed  majority 
voting  to  determine  if  the  target  node  is  to  be  evicted.  This  eliminates  a  single  point  of  failure  in  cluster-based  IDS  protocols. 

Little  work  addresses  both  security  and  performance  issues  of  IDS.  Stern  et  al.  [28]  proposed  data  reduction  techniques 
to  reduce  the  communication  cost  in  their  IDS  design.  Subhadrabandhu  etal.  [29-31]  studied  the  tradeoff  between  energy, 
computational,  and  communication  resource  consumption  versus  IDS  accuracy  based  on  distributed  IDS.  Our  work  also 
concerns  the  effect  of  IDS  on  security  and  performance  properties  of  the  system  since  we  aim  to  determine  the  best  IDS 
operational  settings  to  prolong  the  system  lifetime  against  security  attacks  while  satisfying  the  performance  requirement. 
However,  our  work  differs  from  these  prior  works  in  that  we  integrate  region-voting-IDS  with  hierarchical  key  management 
to  deal  with  both  insider  and  outsider  attacks  and  our  integrated  protocol  specifically  deals  with  secure  GCSs  in  MANET 
environments. 

Our  work  has  its  root  in  model-based  quantitative  analysis  [32],  In  the  literature,  not  much  work  has  been  done  in 
extending  model-based  quantitative  analysis  to  security  analysis.  Zhang  et  al.  [6]  analyzed  several  group  rekeying  algorithms 
in  wireless  environments  and  evaluated  their  performance  characteristics.  Dacier  et  al.  [33]  proposed  a  system  model  using 
a  privilege  graph  demonstrating  operational  security  vulnerabilities  and  transformed  the  privilege  graph  into  a  Markov 
chain  based  on  all  possible  successful  attack  scenarios.  Jonsson  etal.  [34]  presented  a  quantitative  Markov  model  of  attacker 
behaviors  by  proposing  multiple  phases,  such  as  learning,  standard  attack,  and  innovative  attack.  Popstojanova  et  al.  [35] 
presented  a  state  transition  model  to  describe  dynamic  behaviors  of  intrusion  tolerance  systems  including  a  framework 
defining  the  vulnerability  and  the  threat  set.  Madan  et  al.  [36,37]  employed  a  Semi-Markov  Process  (SMP)  model  to  evaluate 
security  attributes  of  an  intrusion-tolerant  system.  A  steady-state  analysis  has  been  used  to  obtain  dependability  measures 
such  as  availability.  A  transient  analysis  with  absorbing  states  has  been  used  to  obtain  security  measures  such  as  MTTSF 
similar  to  the  computation  of  the  mean  time  to  failure  [MTTF)  in  reliability  analysis.  Stevens  et  al.  [38]  also  proposed  a 
networked  intrusion  tolerant  information  system  using  a  model-based  validation  technique,  the  so  called  probabilistic 
modeling.  They  used  two  security  metrics:  the  mean  time  to  discovery  ( MTTD )  refers  to  the  mean  time  between  successive 
discoveries  of  unknown  vulnerabilities  and  the  mean  time  to  exploit  (M7TE)  refers  to  the  mean  time  between  successive 
exploitations  of  a  known  vulnerability.  Wang  et  al.  [39]  utilized  a  higher-level  formalism  based  on  SPN  for  security  analysis  of 
intrusion  tolerant  systems.  Recently,  Leversage  and  James  [40]  suggested  a  security  metric  to  intelligently  compare  systems 
and  to  make  corporate  security  decisions.  They  proposed  a  mean  time-to-compromise  (MTTC)  metric  to  measure  the  time 
needed  for  an  attacker  to  successfully  disrupt  a  target  system.  Most  previous  works  cited  above,  however,  often  only  focused 
on  security  measures  without  considering  the  impact  of  deploying  security  protocols  on  the  performance  of  the  system. 
In  this  paper,  we  develop  model-based  analysis  techniques  that  consider  both  security  and  performance  aspects  with  the 
objective  to  identify  operational  settings  under  which  both  security  and  performance  requirements  can  be  best  satisfied. 

To  the  best  of  our  knowledge,  no  existing  work  considers  the  integration  of  group  key  management  (to  deal  with  outsider 
attacks)  with  intrusion  detection  mechanisms  (to  deal  with  insider  attacks)  for  high  scalable,  reconfigurable,  and  survivable 
GCSs  in  MANETs.  Our  work  is  the  first  that  considers  these  two  mechanisms  and  their  interplay  with  analysis  techniques 
being  developed  to  determine  the  best  setting  to  execute  the  integrated  protocol  to  satisfy  both  the  security  and  performance 
requirements  for  GCSs  in  MANETs. 

3.  Preliminaries 

3.1.  IDS  protocol 

To  effectively  integrate  IDS  with  region-based  hierarchical  group  key  management  and  investigate  their  interplay,  we 
develop  a  new  region-voting-based  IDS  protocol  borrowing  the  concepts  of  distributed  revocation  based  on  majority  voting 
in  sensor  networks  [41]  and  voting-based  IDS  design  in  MANETs  [4].  The  protocol  requires  each  node  to  preinstall  host- 
based  IDS  to  evaluate  its  neighbors  based  on  evidences  collected,  mostly  route-related  and  traffic-related  information  [26], 
Route-related  information  may  include  node  velocity,  new  and  stale  routes,  routes  added  by  overhearing,  repaired  routes, 
route  changes,  and  the  average  route  length.  Traffic-related  information  may  include  packet  type  (e.g.,  data,  route  request, 
route  reply),  flow  direction  (e.g.,  received,  sent,  forwarded,  dropped),  sampling  interval,  and  statistics  measures  (e.g.,  the 
count  and  standard  deviation  of  inter-packet  intervals)  [26,42].  Anomalies  of  route-related  and  traffic-related  information 
can  show  evidence  of  malicious  attacks.  Each  node  can  also  actively  collect  information  such  as  observing  if  a  packet  sent 
to  a  neighbor  is  not  forwarded  as  requested.  For  example,  in  the  MAC  layer,  it  may  include  the  total  number  of  channel 
requests,  the  total  number  of  nodes  making  channel  requests,  and  the  largest,  the  mean,  and  the  smallest  of  all  requests. 
In  the  application  layer,  it  may  include  the  total  number  of  requests  made  to  a  service,  the  number  of  different  services 
requested,  the  average  duration  of  a  service,  the  number  of  nodes  that  have  requested  any  service,  and  the  total  number  of 
service  errors. 

The  host-based  IDS  preinstalled  in  each  node  can  be  characterized  by  two  parameters,  namely,  the  per-node  false 
negative  probability  (pi)  and  per-node  false  positive  probability  (p2).  The  host-based  IDS  can  use  any  general  IDS  technique 
characterized  by  pi  and  p2  such  as  misuse  detection  (also  called  signature-based  detection)  or  anomaly  detection  [29],  Under 
region-voting-based  IDS,  compromised  nodes  are  detected  based  on  majority  voting.  Periodically,  a  target  node  would  be 
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evaluated  by  m  vote-participants  dynamically  and  randomly  selected  out  of  group  members  in  the  same  geographical  region 
(see  Section  3.2  below  for  the  meaning  of  a  “region”  under  region-based  hierarchical  group  key  management). 

Our  proposed  region-voting-based  IDS  protocol  performs  its  detection  and  eviction  function  periodically.  In  a  detection 
interval,  each  node  would  be  evaluated  by  m  vote-participants  where  m  is  a  system  parameter  whose  effect  will  be  analyzed 
in  the  paper.  All  nodes  are  loosely  synchronized  and  votes  are  tallied  only  during  the  current  detection  time  interval.  The 
m  vote-participants  are  randomly  selected  from  members  within  the  same  region  as  follows.  Since  members  within  the 
same  region  know  each  other’s  id  and  location,  each  node  will  utilize  a  preinstalled  hash  function  R(x)  that  takes  in  the 
id  +  location  of  the  target  node  concatenated  with  the  id  +  location  of  every  other  node  as  the  hash  key.  The  m  nodes 
that  yield  the  highest  hash  values  will  be  selected  as  the  m  vote-participants  voting  for  or  against  the  target  node.  The 
main  benefit  of  using  a  randomization  function  to  determine  m  vote-participants  is  that  it  does  require  any  communication 
between  nodes.  Because  there  is  no  vote-coordinator,  it  avoids  a  single  point  of  attack/failure.  Each  vote-participant  node 
selected  will  independently  vote  for  or  against  the  target  node  by  disseminating  its  vote  to  all  other  regional  members.  Vote 
authenticity  is  achieved  via  preloaded  public  keys  of  all  other  group  members.  By  this  way,  all  regional  members  know  who  m 
vote-participants  are,  and,  based  on  votes  received,  can  determine  whether  or  not  a  target  node  is  considered  compromised 
and  needs  to  be  evicted  for  security  reasons. 

Since  each  node  has  a  unique  id,  all  group  members  know  which  target  node  is  being  evaluated.  If  the  majority  decided  to 
vote  against  the  target  node,  then  the  target  node  would  be  evicted  from  the  system.  This  adds  intrusion  tolerance  to  tolerate 
collusion  of  compromised  nodes  in  MANETs.  We  characterize  region-voting-based  IDS  by  two  parameters,  namely,  false 
negative  probability  (Pfn)  and  false  positive  probability  (P/p).  Later  we  will  derive  a  formula  to  assess  these  two  parameters 
based  on  (a)  the  per-node  false  negative  and  positive  probabilities  (pi  and  p2);  (b)  the  number  of  vote-participants,  m, 
selected  to  vote  for  or  against  a  target  node;  and  (c)  an  estimate  of  the  current  number  of  compromised  nodes  which  may 
collude  with  the  objective  to  disrupt  the  service  of  the  system.  Since  m  nodes  are  selected  to  vote,  if  the  majority  of  m  voting- 
participants  (i.e.,  >\m/2~\ )  casts  negative  votes  against  a  target  node,  the  target  node  is  considered  compromised  and  will 
be  evicted  from  the  system.  Here  pi  and  p2  are  probabilities  for  characterizing  how  often  a  good  node  will  make  a  false 
alarm.  When  a  good  node  evaluates  a  bad  node,  it  will  fail  to  identify  it  as  a  bad  node  with  probably  pi;  when  a  good  node 
evaluates  another  good  node,  it  will  misdiagnose  it  as  a  bad  node  with  probability  p2.  A  false  alarm  from  one  good  node 
is  not  harmful  as  long  as  there  are  more  than  f m /2“|  other  good  nodes  correctly  evaluating  the  target  node  in  the  voting 
process.  In  the  special  case  in  which  there  is  only  a  single  region,  all  nodes  in  the  group  are  candidates  as  vote-participants 
against  a  target  node.  The  selection  of  m  vote-participants  from  member  nodes  from  a  geographical  region  of  different  sizes 
affects  the  security  and  performance  aspects  of  region-voting-based  IDS  for  which  we  aim  to  study  its  effect  in  this  paper. 


3.2.  Region-based  group  key  management  protocol 

In  this  paper,  we  extend  region-based  group  key  management  [2]  to  include  intrusion  detection  events  to  tolerate 
insider  attacks.  The  region-based  group  key  management  scheme  divides  a  group  into  region-based  subgroups  based  on 
decentralized  key  management  principles.  This  protocol  requires  each  group  member  to  be  equipped  with  GPS  to  know 
its  location  as  it  moves  across  regions.  When  a  member  crosses  a  regional  boundary,  it  changes  its  subgroup  “regional” 
membership  although  it  is  still  a  member  of  the  group.  For  secure  group  communications  [43],  all  group  members  share  a 
secret  group  key,  Kc.  All  subgroup  members  in  region  i  share  a  secret  regional  key,  I<Ri.  In  addition,  each  region  has  a  leader 
and  all  leaders  in  the  system  share  a  leader  secret  key  ( I<RL )  for  efficiency  purposes.  In  summary,  there  are  three  keys  for 
hierarchical  group  key  management:  leader  key  [Krl),  regional  key  [I<R ),  and  group  key  (Kc).  These  keys  are  rekeyed  properly,  in 
part  or  in  whole,  as  events  happen  in  the  system,  including  group  join/leave/eviction,  node  failure,  regional  boundary  cross, 
and  group  merge/partition.  In  this  paper,  we  consider  additional  events  associated  with  IDS  activities,  including  intrusion 
detection  timer  events  and  eviction  events.  In  addition  to  maintain  secrecy,  our  region-based  key  management  scheme  also 
maintains  membership  consistency  through  three  membership  views:  (a)  Regional  View  (RV)  contains  regional  membership 
information  including  regional  (or  subgroup)  members’  ids  and  their  location  information,  (b)  Leader  View  (LV)  contains 
leaders’  ids  and  their  location  information,  and  (c)  Group  View  (GV)  contains  group  membership  information  that  includes 
members’  ids  and  their  location  information  [2,44]. 

For  robustness  and  distributed  control,  a  contributory  key  agreement  (C/<A)  protocol  based  on  Group  Diffie-Hellman 
(GDH)  [45]  is  used  to  generate  and  manage  a  shared  secret  key.  In  particular,  we  use  GDH.3  (the  optimized  version  of 
GDH)  allowing  the  use  of  fixed-sized  messages  and  only  a  constant  number  of  exponentiation  operations  executed  by  each 
participant.  With  these  features,  GDH.3  has  been  proposed  to  be  used  for  rekeying  by  mobile  devices  with  low  computational 
capabilities  [45].  If  an  inside  attacker  interrupts  the  rekeying  process,  then  a  timeout  event  will  be  triggered  to  restart  the 
rekeying  process.  A  smart  inside  attacker,  however,  may  decide  to  allow  the  rekeying  operation  to  be  executed  to  completion 
so  that  it  can  learn  of  the  new  secret  group  key  with  which  to  obtain  secret  group  information  and  cause  the  system  to  fail. 
If  it  continues  to  interrupt  rekeying,  it  may  eventually  be  caught  by  IDS  and  evicted  from  the  system  before  it  causes  the 
system  to  fail. 

A  regional  leader  will  be  selected  randomly  among  regional  members  so  that  the  adversary  will  not  have  a  specific  target 
to  launch  their  attacks.  We  add  randomness  to  the  leader  selection  process  by  introducing  another  hashing  function  H(x) 
that  takes  in  the  id  of  a  node  concatenated  with  the  current  location  of  the  node  as  the  hash  key.  The  node  with  the  smallest 
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returned  hash  value  would  then  become  the  regional  leader.  Since  regional  members  know  each  other’s  id  and  location, 
they  can  independently  execute  the  hash  function  to  determine  which  node  should  be  the  leader. 

Below  we  briefly  describe  how  the  extended  region-based  group  key  management  protocol  handles  group  partition, 
group  merge,  node  eviction,  group  communication,  and  intrusion  detection  events.  Standard  mobile  group  events  such  as 
member  join,  member  leave  and  mobility  induced  operations  are  described  in  [2]  and  will  not  be  repeated  here. 

Croup  partition:  Group  members  may  lose  connection  with  each  other  due  to  node  failure  or  node  mobility.  Thus,  a  group 
may  be  partitioned  into  multiple  groups  dynamically.  In  general,  group  partition  increases  as  the  node  density  decreases 
and  as  the  node  mobility  increases.  A  group  partition  event  starts  with  a  region  being  partitioned  and  detected  by  members 
in  the  region  missing  the  leader’s  beacon  message,  and  by  the  leader  missing  its  regional  members’  beaconing  messages. 
In  the  former  case,  a  new  leader  may  be  elected  following  a  leader  election  protocol.  In  the  latter  case,  the  leader  with  the 
remaining  nodes  in  the  partitioned  region  will  execute  GDH  to  agree  on  a  new  regional  key  I<R.  In  any  case,  all  leaders  in 
each  partitioned  group  will  execute  GDH  to  agree  on  a  new  leader  key  KRL. 

Croup  merge:  Two  groups  may  merge  into  one  when  connectivity  resumes.  A  group  merge  event  is  detected  by  members 
within  a  region  detecting  the  presence  of  non-group  members  by  listening  to  beacon  messages  issued  from  nodes  with  a 
different  group  ID.  After  authentication  a  new  regional  leader  in  a  merged  region  is  selected  following  the  leader  selection 
protocol.  Then  members  in  the  merged  region  will  execute  GDH  to  agree  on  a  new  regional  key  I<R  following  the  “group 
member  join ”  protocol  as  if  members  in  the  merged  region  had  just  newly  joined  the  group.  The  new  leader  in  the  merged 
region  then  coordinates  with  all  other  leaders  to  execute  GDH  to  agree  on  a  new  leader  key  I<RL.  Finally,  a  new  group  key  Kc 
is  generated  by  all  leaders  and  is  distributed  to  all  group  members  in  the  merged  group. 

Eviction:  After  a  node  is  detected  as  compromised,  the  group  key  Kc  is  rekeyed  based  on  GDH  to  evict  the  compromised 
node. 

Croup  communication:  Each  node  may  communicate  with  other  nodes  in  the  group  to  request  data.  We  assume  that 
the  time  interval  is  exponentially  distributed  with  rate  Xq.  A  node  may  disseminate  a  message  to  its  group  members  by 
sending  the  message  to  its  leader  using  KR,  the  leader  would  then  forward  the  message  to  other  leaders  using  I<RL ;  finally, 
each  leader  will  disseminate  the  message  to  all  its  regional  members  using  KR  based  on  multicasting. 

Intrusion  detection:  Messages  required  for  IDS  activities  follow  the  rules  for  group  communication,  including  status 
exchange,  vote-participant  selection,  vote-participant-list  dissemination,  and  vote  dissemination.  A  target  node  is  examined 
by  IDS  periodically  and  if  the  target  node  is  considered  compromised,  it  will  be  evicted  by  rekeying  the  group  key  Kc  based 
on  GDH. 

4.  System  model 


We  assume  that  the  GCS  is  in  MANETs  in  which  there  is  no  centralized  key  server  to  authenticate  and  authorize  individual 
group  members.  Each  node  is  preloaded  with  the  public  keys  of  all  other  group  members  for  authentication  purposes.  For 
efficiency  reasons,  a  group  key  is  used  for  group  communications  by  group  members.  The  group  key  is  rekeyed  by  running 
a  contributory  key  agreement  (CI<A)  protocol  such  as  GDH,  as  there  is  no  centralized  trust  entity  to  generate  and  disseminate 
the  group  key.  The  group  members  of  the  proposed  GCS  in  MANETs  are  assumed  to  be  spread  over  a  geographical  area 
A  =  nr2  with  r  being  the  radius. 

The  workload  and  operational  conditions  of  a  GCS  in  MANETs  can  be  characterized  by  a  set  of  model  parameters.  Assume 
that  a  node  may  leave  a  group  voluntarily  with  rate  /z  and  may  rejoin  any  group  with  rate  A  due  to  tactical  reasons.  Then, 
the  probability  that  a  node  is  in  any  group  is  A/(A.  +  /z)  and  the  probability  that  it  is  not  in  any  group  is  /z/(A.  +  /z).  Nodes 
can  move  freely  with  a  mobility  rate  of  a.  A  group  is  connectivity-oriented,  that  is,  nodes  that  are  connected  with  each 
other  form  a  group.  When  all  nodes  are  connected,  there  is  only  a  single  group  in  the  system.  Due  to  node  mobility,  a  group 
may  be  partitioned  into  two.  Conversely,  two  groups  may  merge  into  one  as  connectivity  resumes.  We  assume  that  the 
secure  GCS  is  designed  to  support  a  mission  critical  application.  All  nodes  are  charged  to  complete  a  mission  and  the  mission 
critical  application  allows  group  merging  and  partitioning  activities  in  response  to  network  dynamics.  However,  a  group, 
no  matter  what  its  size,  acts  independently  of  other  groups  with  the  intent  to  continue  with  mission  execution.  Nodes  in  a 
group  must  satisfy  the  forward/backward  secrecy,  confidentiality,  integrity,  and  authentication  requirements  for  secure  group 
communications  in  the  presence  of  outsider  and  insider  attacks.  Reliable  transmission  is  a  system  requirement  for  secure 
group  communications.  We  assume  that  view  synchrony  (VS)  is  guaranteed  in  GCSs  [2],  which  guarantees  that  messages  are 
delivered  reliably  and  in  the  proper  order  under  the  same  membership  view.  That  is,  a  receiver  will  see  the  same  membership 
view  as  viewed  by  the  sender.  While  maintaining  view  synchrony  introduces  many  control  messages,  it  will  only  affect  the 
communication  cost  incurred,  but  not  MTTSF  because  it  won’t  cause  a  security  failure  condition  to  be  satisfied. 

We  use  a  hexagonal  coverage  model  [2]  to  cover  the  operational  area.  While  there  are  many  ways  to  divide  the  operational 
area  into  multiple  regions,  we  choose  the  hexagonal  model  for  analytical  solution  conveniences.  The  solution  methodology 
can  be  applied  to  other  models  such  as  mesh  networks  without  loss  of  generality.  The  operational  geographical  area  A  =  nr2 
is  divided  into  R(n)  =  3 n2  +  3n  +  1  regions  such  that  there  are  61  regions  with  n  =  4,  37  regions  with  n  =  3,  19  regions 
with  n  =  2,1  regions  with  n  =  1,  and  1  region  with  n  =  0,  where  n  is  the  ring  level.  The  probability  that  a  member  moves 
across  a  boundary  between  two  regions,  denoted  by  PRM(n),  is  given  by: 


Prm  (hi)  = 


6(3n2  +  3n+l)-(12n  +  6) 
6(3n2  +  3n  +  1) 


(1) 
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It  can  be  shown  that  the  regional  mobility  rate  an  is  given  by: 

On  =  (2n  +  l)nPfiM(n).  (2) 

For  attacker  behaviors,  we  assume  the  presence  of  smart  inside  attackers  who  will  attempt  to  compromise  nodes  with 
a  variable  rate  depending  on  the  number  of  compromised  nodes  in  the  system.  Based  on  the  principles  given  in  [41],  we 
assume  that  the  attacker  strength  increases  proportional  to  the  number  of  compromised  nodes  in  the  system  because  of  the 
collusion  of  compromised  nodes.  To  dynamically  react  to  the  attacker  strength,  we  also  adaptively  increase  the  intrusion 
detection  rate  proportional  to  the  number  of  compromised  nodes  detected  to  counteract  intrusion. 

To  alleviate  collusion,  the  system  performs  region-voting-based  IDS  by  which  a  majority  of  m  vote-participants  must 
agree  in  order  to  evict  a  suspicious  target  node,  where  m  is  the  number  of  vote-participants  randomly  selected  from  group 
members  in  the  same  region  of  the  target  node.  Whether  one  should  select  vote-participants  from  a  larger  or  smaller  region 
depends  on  the  security  and  performance  requirements  of  the  GCS.  The  special  case  in  which  there  is  no  region  is  used  as  a 
baseline  model  against  which  region-voting-based  IDS  is  compared.  In  general,  region-voting-based  IDS  is  characterized  by 
its  false  negative  probability  (Pjn)  and  false  positive  probability  ( Pfp )  which  depend  on  pi  and  p2  in  host-based  IDS  and  the 
number  of  compromised  nodes  in  the  system. 

4.1.  Security  failure  conditions 

We  assume  that  the  GCS  enters  a  security  failure  state  when  any  mobile  group  fails.  A  mobile  group  fails  when  one  of  the 
two  conditions  below  is  true: 

Condition  Cl:  a  compromised  but  undetected  member  requests  and  subsequently  obtains  “secret”  data  using  the  group 
key,  representing  a  loss  of  integrity  [46]  security  failure. 

Condition  C2:  more  than  1/3  of  member  nodes  in  a  group  are  compromised  but  not  detected  by  IDS,  representing  a  loss 
of  availability  [46]  security  failure.  This  follows  the  Byzantine  Failure  model  [47]  such  that  when  more  than  1/3  of  member 
nodes  are  compromised,  the  system  fails. 

The  first  condition  represents  that  important  data  are  compromised.  The  second  condition  represents  that  the  mobile 
group  is  unable  to  function  correctly  and  is  compromised  as  a  whole.  Both  conditions  lead  to  security  failure. 

4.2.  Metrics 

We  use  the  following  two  metrics  to  measure  security  and  performance  properties  of  a  GCS  in  MANETs: 

MTTSF  (Mean  Time  to  Security  Failure):  This  metric  indicates  the  lifetime  of  the  GCS  before  it  reaches  a  security  failure 
state.  For  a  secure  GCS,  a  security  failure  occurs  when  either  Condition  Cl  or  Condition  C2  is  true.  As  a  security  metric,  lower 
MTTSF  means  a  faster  loss  of  system  integrity  or  loss  of  availability.  Therefore,  a  design  goal  is  to  maximize  MTTSF. 

Communication  cost(Ctotai ):  This  metric  indicates  total  control  message  traffic  incurred  per  time  unit  (s)  including 
group  communication,  status  exchange,  rekeying,  intrusion  detection,  beacon,  group  partition/merge,  and  mobility-induced 
activities.  A  high  Ctota|  translates  into  a  high  level  of  contention  over  the  wireless  channel  and  consequently  a  high  response 
time  for  group  communication  operations.  A  design  goal  is  to  minimize  the  average  Ctotai  by  exploring  the  tradeoff  between 
inter-regional  and  intra-regional  communication  costs.  The  computational  cost  is  not  included  in  Ctotai- 

4.3.  Attack  model 

We  integrate  region-based  group  key  management  with  region-voting-based  IDS  to  deal  with  both  outsider  and 
insider  attacks.  In  general,  an  outside  attacker  would  attempt  to  gain  authorized  access  and  then  perpetrate  as  an  inside 
attacker.  Below  we  discuss  possible  outsider  attack  scenarios  [48]  and  our  countermeasures.  An  outside  attacker  can  gain 
unauthorized  access  to  a  legitimate  account  by  eavesdropping  data  packets  or  any  message  containing  a  secret  key  for  more 
sophisticated  attacks.  We  use  individual  rekeying  [49]  at  three  different  levels  (i.e.,  regional  key,  leader  key,  and  group  key)  to 
prevent  loss  of  confidentiality.  An  outside  attacker  can  also  attempt  to  modify  a  data  packet  to  break  data  integrity.  We  use  a 
symmetric  key  (the  group  key)  shared  by  only  legitimate  group  members  to  prevent  loss  of  data  integrity.  An  outside  attacker 
may  impersonate  a  group  member  to  join  a  group.  We  have  each  node  preloaded  with  public  keys  of  all  other  group  members 
to  ensure  source  authenticity  and  to  prevent  potential  impersonation  attacks  during  the  authentication  process  of  a  new 
group  member’s  join.  Active  outsider  attacks  such  as  denial  of  service  (DoS)  attacks  can  also  be  eased  by  authentication.  An 
outside  attacker  may  forge  packets.  Since  only  legitimate  group  members  with  a  secret  key  distributed  can  understand  data 
packets  communicated  by  other  group  members,  forged  packets  will  be  discarded.  An  outsider  can  also  perform  jamming 
attacks.  The  standard  defense  against  jamming  includes  spread  spectrum  or  frequency  hopping  communication,  locating 
the  jamming  area  and  rerouting  traffic. 

Insider  attackers  are  compromised  nodes  disguising  themselves  as  legitimate  healthy  members  to  disrupt  the  system.  We 
discuss  insider  attack  scenarios  and  our  countermeasures  below.  An  adversary  can  collude  with  other  compromised  nodes 
so  as  to  more  efficiently  compromise  another  node  and  eventually  cause  the  system  to  fail.  For  example,  in  the  process 
of  region-voting-based  intrusion  detection,  an  adversary  can  cast  a  negative  vote  against  a  trusted  healthy  node  or  cast  a 
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Fig.  2.  SPN  performance  model. 

positive  vote  for  a  compromised  node.  An  inside  attacker  aims  to  obtain  secret  group  information  by  communicating  with 
other  group  members  with  a  legitimate  group  key.  Thus,  a  smart  inside  attacker  may  not  wish  to  interrupt  the  rekeying 
process  so  that  it  can  learn  of  the  new  group  key  with  which  to  obtain  secret  group  information.  When  this  happens, 
a  security  failure  due  to  Condition  Cl  occurs.  This  is  countered  by  invoking  IDS  at  the  optimal  rate  identified  by  our 
mathematical  model  to  detect  and  remove  compromised  nodes,  thus  prolonging  the  system  lifetime.  A  compromised  node 
can  share  information  with  other  nodes  including  both  outside  attackers  and  inside  attackers  to  more  easily  compromise 
other  nodes.  We  model  the  attacker  strength  being  proportional  to  the  degree  of  compromised  nodes  in  the  system,  which 
can  be  estimated  by  our  mathematical  model.  The  increase  in  attacker  strength  is  countered  by  increasing  the  IDS  detection 
rate  proportional  to  the  number  of  compromised  nodes  detected  by  IDS.  When  more  than  1/3  of  the  group  members  have 
been  compromised,  a  security  failure  occurs  due  to  Condition  C2.  This  is  also  countered  in  our  system  by  invoking  IDS  to 
detect  and  remove  compromised  nodes. 

5.  Performance  model 

We  develop  a  mathematical  model  based  on  SPN  as  shown  in  Fig.  2  to  describe  the  behaviors  of  a  GCS  instrumented 
with  IDS  to  deal  with  insider  attacks  and  region-based  group  key  management  to  deal  with  outsider  attacks  in  MANETs. 
The  purpose  of  developing  the  model  is  to  identify  optimal  settings  to  maximize  MTTSF  while  satisfying  the  performance 
requirement  in  terms  of  Ctotai-  The  SPN  model  is  constructed  as  follows: 

The  SPN  model  describes  the  behavior  of  a  mobile  group  as  it  evolves.  It  suffices  to  model  the  behavior  of  a  single  mobile 
group  as  the  GCS  fails  when  a  single  group  fails  for  mission-critical  applications.  A  mobile  group  may  be  partitioned  into 
two  and  may  merge  with  another  group  during  its  lifetime.  We  track  trusted  members,  compromised  members  undetected, 
and  compromised  members  detected  during  its  lifetime  to  understand  its  security  and  performance  characteristics. 

We  use  places  to  classify  nodes  except  for  place  Nc  which  holds  the  current  number  of  groups  in  the  system.  Specifically, 
place  Tm  holds  trusted  members,  UCm  holds  compromised  nodes  not  yet  detected  by  IDS,  and  DCm  holds  compromised 
nodes  that  have  been  detected  by  IDS.  Note  that  Tm,  UCm,  and  DCm  represent  nodes  in  one  group,  not  in  the  system.  Thus, 
the  numbers  of  nodes  in  places  Tm,  UCm,  and  DCm,  obtained  by  mark(Tm),  rnark(UCm),  and  mark(DCm),  respectively,  would 
be  adjusted  based  on  the  number  of  groups  existing  in  the  system  (obtained  by  mark(JVc)),  which  changes  upon  group 
merge/partition  events. 

We  use  transitions  to  model  events.  The  event  occurrence  times  are  either  deterministic  or  exponentially  distributed 
in  our  model.  Specifically,  T_FA  models  a  node  being  falsely  identified  as  compromised.  TJDS  models  a  compromised 
node  being  detected.  The  event  occurrence  times  to  T_FA  and  TJDS  are  fixed  periodic  times  due  to  the  use  of  the  base 
periodic  detection  interval  TiDS.  T_CP  models  a  node  being  compromised.  In  the  reliability  community,  it  is  well  accepted 
that  the  software/hardware  failure  time  follows  exponential  distribution.  Therefore,  it  is  justified  to  assume  that  the 
event  occurrence  time  of  T_CP  is  exponentially  distributed.  T_MER  and  T_PAR  model  the  group  merge  or  partition  event, 
respectively.  We  model  group  merge  and  partition  events  by  a  birth-death  process  (a  Markov  model)  as  shown  in  Fig.  3  so 
the  time  it  takes  for  a  group  partition  or  merge  event  to  occur,  in  T_MER  or  T_PAR,  is  inherently  exponentially  distributed. 
We  have  used  deterministic  distribution  for  T_RK  for  modeling  the  rekeying  time  in  the  analysis.  T_DRQ,  models  a  data 
leak  security  failure  (Condition  Cl).  The  assumption  that  packet  arrivals  follow  a  Poisson  distribution  is  well  accepted  in 
the  networking  community  and  has  been  used  in  many  previous  studies.  Therefore,  it  is  justified  to  assume  that  the  event 
occurrence  time  of  T_DRQ  is  exponentially  distributed. 

A  firing  of  a  transition  will  change  the  state  of  the  system,  which  is  represented  by  the  distribution  of  tokens  in  the  SPN. 
For  example,  mark(NG)  changes  upon  firing  T_MER  or  TJ’AR  since  the  number  of  groups  is  changed  upon  a  group  merge 
or  partition  event;  the  number  of  compromised  nodes  undetected  is  incremented  by  1  and,  thus,  place  UCm  will  hold  one 
more  token  when  T_CP  fires.  A  transition  is  eligible  to  fire  when  the  firing  conditions  associated  with  the  event  are  met.  This 
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Fig.  3.  A  birth-death  process  for  group  merging/partitioning  events. 

is  presented  by  (1)  its  input  place  must  contain  at  least  one  token  and  (2)  the  associated  enabling  guard  function,  if  exists, 
must  return  true.  For  example,  T_CP  is  enabled  to  fire  when  there  are  “good”  nodes  in  the  group,  that  is,  place  Tm  holds  at 
least  one  token,  and  the  enabling  function  associated  with  T_CP  returns  true. 

Except  for  tokens  contained  in  place  Nc,  we  use  a  “token”  in  the  SPN  model  to  represent  a  node  in  the  group.  The 
population  of  each  type  of  nodes  is  equal  to  the  number  of  tokens  in  the  corresponding  place.  Initially,  all  N  members 
are  trusted  in  one  group  and  put  in  place  Tm  as  tokens. 

Trusted  members  may  become  compromised  because  of  insider  attacks  with  a  node-compromising  rate  A(mc).  This  is 
modeled  by  firing  transition  T_CP  and  moving  tokens  one  at  a  time  (if  it  exists)  from  place  Tm  to  place  UCm.  Tokens  in  place 
UCm  represent  compromised  but  undetected  member  nodes.  We  consider  the  system  as  having  experienced  a  security  failure 
when  data  are  leaked  out  to  compromised  but  undetected  members,  i.e.,  Condition  Cl.  A  compromised  but  undetected 
member  will  attempt  to  compromise  data  from  other  members  in  the  group.  This  is  modeled  by  associating  transition  T_DRQ 
with  rate  Xq  *  mark(UCm).  Firing  transition  T_DRQ  will  move  a  token  into  place  GF,  at  which  point  we  regard  the  system  as 
having  experienced  a  security  failure  due  to  Condition  Cl. 

A  compromised  node  in  place  UCm  may  be  detected  by  IDS  before  it  compromises  data  in  the  GCS.  The  intrusion  detection 
activity  of  the  system  is  modeled  by  the  detection  function  with  rate  D(mj.  Whether  the  damage  has  been  done  by  a 
compromised  node  before  the  compromised  node  is  detected  depends  on  the  relative  magnitude  of  the  node-compromising 
rate  (A(mc))  versus  the  IDS  detection  rate  (D(md)).  When  transition  TJDS  fires,  a  token  in  place  UCm  will  be  moved  to 
place  DCm,  meaning  that  a  compromised,  undetected  node  now  becomes  detected  by  IDS.  For  region-voting-based  IDS, 
the  transition  rate  of  TJDS  is  mark(L/Cm)  *  D(md)  *  (1  —  P/n),  taking  into  consideration  the  false  negative  probability 
of  region-voting-based  IDS  used.  Region-voting-based  IDS  can  also  false-positively  identify  a  trusted  member  node  as 
compromised.  This  is  modeled  by  moving  a  trusted  member  in  place  Tm  to  place  DCm  after  transition  T_FA  fires  with  rate 
mark(Tm)  *  D(mj  *  P/p.  Note  that  region-voting-based  IDS  parameters,  Pfn  and  P/p,  can  be  derived  based  on  pi  and  p2,  the 
number  of  vote-participants  (m),  and  the  current  number  of  compromised  nodes  which  may  collude  to  disrupt  the  service 
of  the  system.  Later  we  will  show  how  we  may  parameterize  Pfn  and  PfP- 

Finally,  the  group  is  considered  as  experiencing  a  security  failure  if  either  one  of  the  two  security  failure  conditions, 
Condition  Cl  or  Condition  C2,  is  met.  This  is  modeled  by  making  the  group  enter  an  absorbing  state  when  either  Condition 
Cl  or  Condition  C2  is  true.  In  the  SPN  model,  this  is  achieved  by  associating  every  transition  in  the  SPN  model  with  an  enabling 
function  that  returns  false  (thus  disabling  the  transition  from  firing)  when  either  Condition  Cl  or  Condition  C2  is  met,  and 
returns  true  otherwise.  For  the  SPN  model,  Condition  Cl  is  true  when  mark(GF)  >  0  representing  that  data  have  been  leaked 
out  to  compromised,  undetected  members;  Condition  C2  is  true  when  more  than  1/3  of  member  nodes  are  compromised 
but  undetected  as  indicated  in  Eq.  (3)  below,  where  mark  ( UCm )  returns  the  number  of  compromised  but  undetected  nodes 
in  the  group  and  mark(Tm)  returns  the  number  of  trusted  healthy  nodes  in  the  group. 

rnark(DCm)  >  1 
mark(Tm)  +  mark(L/Cm)  3 


5.1.  Group  merge  and  partition 


We  model  group  merge  and  partition  events  by  a  birth-death  process  as  shown  in  Fig.  3.  When  the  system  has  i  groups, 
i.e.,  in  state  i,  the  group  partitioning  rate  is  Xnpi  and  the  group  merging  rate  is  ptnmj.  We  parameterize  merging/partitioning 
rates  by  means  of  simulation.  We  first  observe  the  number  of  merge  and  partition  events  by  simulation  for  a  sufficiently 
long  period  of  time  T.  We  next  observe  the  sojourn  time  S,  when  the  system  stays  in  state  i,  i.e.,  when  i  groups  are  present 
in  the  system.  Let  Nnm  i  and  Nnp  ,  be  the  numbers  of  group  merge  and  partition  events  observed  in  state  i,  respectively.  Then, 
by  first  order  approximation  the  merging/partitioning  rates  in  state  i,  represented  by  /xnm,,  and  Xnpj,  may  be  estimated  by: 


Pnm.i 


hinm.i 


X 


np.i  — 


(4) 


Note  that  the  merging/partitioning  rates  parameterized  in  Eq.  (4)  are  a  function  of  the  node  mobility  and  density  in 
general.  We  observe  that  when  node  density  is  high,  group  merge  is  more  likely  to  occur  than  group  partition,  thus  leading 
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to  a  smaller  number  of  groups  observed  in  the  system.  On  the  other  hand,  as  the  node  density  is  low,  the  system  is  more 
likely  to  stay  in  a  state  in  which  there  is  a  large  number  of  groups.  In  other  words,  when  the  node  density  is  low,  group 
partitioning  is  more  likely  to  occur  than  group  merging. 


5.2.  Calculation  of  security  and  performance  metrics 


We  calculate  MTTSF  using  the  concept  of  mean  time  to  absorption  in  the  SPN  model.  Specifically,  we  use  a  reward 
assignment  such  that  a  reward  of  1  is  assigned  to  all  states  except  absorbing  states  which  is  modeled  based  on  the  two 
security  failure  conditions  (i.e.,  if  either  Condition  Cl  or  Condition  C2  is  met,  the  system  fails).  By  this  reward  assignment,  a 
reward  of  1  (time  unit)  is  cumulatively  added  to  the  system  lifetime  with  every  time  unit  elapsed  until  the  system  fails. 

We  calculate  Ctotal  by  the  probability-weighted  average  of  Ctotai j  representing  the  communication  cost  incurred  per  time 
unit  (s)  in  state  i.  Specifically,  Ctotai  is  calculated  by  accumulating  Ctotai,i(t )  over  MTTSF  divided  by  MTTSF,  i.e., 


Ctotal  — 


CmF  cmaUi(t)dt 
MTTSF 


(-total. i  is  calculated  as: 


(5) 


Ctotai,/  —  Cgc,  i  T"  Cstatus.i  T"  Crekey,/  T"  C/ps,  i  “I-  Cbeacono  T"  Cmp  j  +  Cmobility,i  (6) 

where  CGC>i,  CstatuSii,  Crekey,i,  CIDSA,  CbeaconA,  CmpA,  and  CmobUityA  are  the  cost  components  for  group  communication,  status 
exchange,  rekeying,  intrusion  detection,  beacon,  group  partition/merge,  and  mobility  events,  respectively,  given  that  the 
number  of  groups  in  the  system  is  i.  The  calculation  of  Cbeacon.i,  CmpA,  and  Cmobmty.i  follows  that  in  [2],  Below  we  explain  how 
we  calculate  Ccca ,  CstatuSij,  Crekey,n  C/DSi,  based  on  the  SPN  model. 

Cgc.  i-  this  cost  includes  the  communication  cost  incurred  by  group  communication  activities.  It  is  calculated  by: 


Cgc.i  —  Aq  x  N  x  (bcc / e)  x  [Nregj0n  A  x  Hreg10n  x  Hiea(jer  ,]  (7) 

where  Xq  is  the  group  communication  rate,  N  is  the  number  of  active  group  members  in  the  single  group  observed  (i.e., 
mark(L/Cm)  +  rnark(Tm)),  bCc  is  the  message  size  (bits)  of  a  group  communication  packet,  e  is  the  channel  error  probability 
and  thus  bcc/e  is  the  total  number  of  bits  transmitted  before  the  packet  is  received,  NregionA  is  the  number  of  regions  in  a 
group  where  there  are  i  groups,  Hregion  is  the  number  of  hops  multicast  from  a  leader  to  all  regional  members,  and  Hieader,;  is 
the  number  of  hops  multicast  from  one  leader  to  all  other  leaders  in  a  group  where  there  are  i  groups.  The  number  of  hops 
in  H region  is  counted  based  on  the  use  of  a  binary  tree  for  a  multicasting  message.  Here  Aregi0n  is  the  area  of  a  region,  s  is  the 
circum-radius  of  a  hexagon-shaped  region,  R  is  the  wireless  per-hop  radio  range  (m),  and  i  is  the  number  of  groups  observed 
in  the  system  and  R(n)  returns  the  number  of  regions  in  the  entire  system.  When  there  are  i  groups  in  the  system,  the  radius 
of  a  group  can  be  approximated  as  r/Vi  where  r  is  the  radius  of  the  operational  area.  Hieader,;  returns  0  when  i  =  1  as  a 
special  case.  Hregion,  FI  leader,/,  and  NregionA  are  calculated  as  follows: 


Hrppinn  —  X  (N} 

K 


members 


—  1)  where  s  = 


3s/3 


^region  ^region  — 


R(n) 


FT leader,!  —  *  (^region, i  1) 


N, 


R(n) 


region,! 


Qtatus.i'-  this  cost  is  for  group  node  status  exchange  for  intrusion  detection.  It  is  calculated  by: 


(8) 

(9) 

(10) 


2.  _  (hi  x  (bs/e))  x  [Nregl0n  i  X  FfrCgI0n  T  Hieader,i]  .  . 

Cstatus.i  =  ~  (11) 

1  status 

where  TstatUs  is  the  periodic  time  interval  for  disseminating  a  status  exchange  message,  N  is  the  number  of  group  members, 
and  bs  is  the  message  size  (bits)  of  the  status  exchange  information. 

Crekey. i'-  this  cost  is  for  group  key  rekeying  due  to  join/leave  events  and  forced  evictions  to  evict  detected  compromised 
nodes.  It  is  calculated  as: 


Crekey.i  —  Cj0jn/ieavej  +  Ceviction,f 


(12) 


where  Cj0in/ieave,i  is  the  cost  introduced  by  leave  and  join  operations  per  time  unit  and  CeViction,!  is  the  cost  introduced  by 
forced  evictions  per  time  unit.  Cj0in/ieave,i  is  as  calculated  in  [2]  and  is  not  repeated  here.  CeViction,!  is  calculated  by: 


Ceviction,!  =  [: rate(TJDS )  +  rate(T_FA )]  x  CieaVe,i- 


(13) 


68 


J.-H.  Cho,  I.-R.  Chen  / Performance  Evaluation  68  (201 1 )  58-75 


Here  rate(TJDS)  gives  the  intrusion  detection  rate  and  rate(T_FA)  gives  the  false  alarm  rate  which  detects  trusted  nodes  as 
compromised  nodes.  Both  can  be  obtained  easily  from  the  SPN  model.  Cieavej  is  the  cost  per  leave  operation  when  there  are 
i  groups  in  the  system,  as  calculated  in  [2], 

C/us  i :  this  is  the  communication  cost  due  to  IDS.  For  region-voting-based  IDS,  this  cost  is  computed  as: 

C/DS, !  =  D(md )  x  (1  -Pfn)  xN  x  [bm_/ist  +  m  x  bv]/e  x  [H^der  i  +  ^region  xN  region,/  ]  (14) 

where  D(md)  is  the  detection  rate  proportional  to  the  number  of  compromised  nodes  detected  by  IDS  to  counter  increasing 
attacker  strength,  P/n  is  the  probability  of  false  negatives,  N  is  the  number  of  current  members  in  a  group,  m  is  the  number 
of  vote-participants  against  a  target  node,  bm_/,st  is  the  message  size  (bits)  of  the  list  containing  m  vote  participants,  bv  is  the 
message  size  (bits)  of  a  vote,  and  H “™der  t  is  the  number  of  hops  between  two  leaders  calculated  as  r /R\fi. 

6.  Numerical  results  and  analysis 

In  this  section,  we  present  numerical  data  for  MTTSF  and  Ctotai  obtained  when  given  a  set  of  parameter  values  character¬ 
izing  the  operational  and  environmental  conditions  and  show  that  there  exist  optimal  design  settings  in  terms  of  the  IDS 
detection  interval  and  the  regional  area  size  under  which  MTTSF  is  maximized  while  Ctotai  is  minimized  for  a  GCS  in  MANET 
environments.  We  first  parameterize  (i.e.,  give  values  to)  key  model  parameters  reflecting  the  operating  conditions  of  the 
GCS  and  then  we  present  numerical  results  and  analysis  with  physical  interpretations  given. 


6.1.  Parameterization 


Below  we  describe  how  to  assign  proper  values  to  several  key  model  parameters  to  reflect  the  operational  and  environ¬ 
mental  conditions  of  the  system. 

JV:  This  is  the  number  of  current  active  group  members  in  the  system.  This  number  evolves  dynamically  as  the  system 
evicts  compromised  nodes.  Since  a  node  leaves  the  group  voluntarily  with  rate  /a  and  joins  the  group  with  rate  X,  the 
probability  that  a  node  is  active  in  the  group  is  X/(X  +  /z)  and  the  probability  that  it  is  not  is  /z/(A  +  /z).  Let  Ntotai  be 
the  total  group  population  at  any  time.  Then,  N  =  NtotaiV(^  +  /-0-  In  the  SPN  model,  we  initially  place  (Nimt/mark(Nc))  x 
(X/(X  +  /z))  tokens  in  place  Tm,  as  the  initial  population  of  a  mobile  group.  As  the  system  evolves,  N  is  obtained  with 
mark(Tm)  +  rnark((JCm)  indicating  the  number  of  current  active  members  in  a  group. 

Tcm:  This  is  the  communication  time  required  for  disseminating  a  rekeying  message.  The  reciprocal  of  Tcm  is  the  rate  of 
transition  T_RK.  As  we  use  GDH  for  rekeying,  Tcm  can  be  calculated  by  CieaVe,i  divided  by  BW,  the  wireless  network  bandwidth 
(Mbps)  in  the  MANET. 

A(mc):  This  is  the  attacker  function  that  returns  the  rate  at  which  a  node  is  compromised  in  the  system.  It  is  also 
the  rate  of  transition  T_CP.  Specifically,  A(mc )  =  Xc  x  mc  where  Xc  is  the  base  compromising  rate  and  mc  represents 
the  degree  of  compromised  nodes  currently  in  the  system,  defined  by  the  ratio  of  N  to  the  number  of  good  nodes,  i.e., 
mc  =  (mark(DCm)  +  rnark(Tm))/mark(Tm).  The  base  compromising  rate  (A.c)  can  be  obtained  by  first-order  approximation 
from  observing  the  number  of  compromised  nodes  over  a  time  period. 

D(md ):  This  is  the  detection  function  [41]  that  returns  the  rate  at  which  intrusion  detection  is  invoked,  adjusted  based 
on  the  accumulated  number  of  nodes  that  have  been  detected  by  IDS.  It  is  also  the  transition  rate  of  TJDS  in  our  SPN  model. 
Specifically,  D(md)  =  md/Tms ,  where  TtDS  is  the  base  intrusion  detection  interval  and  md  represents  the  degree  of  nodes  that 
have  been  detected  by  IDS,  defined  by  the  ratio  of  Ninit  to  N,  i.e.,  md  =  JVimt/(mark(DCm)  +  mark(Tm)).  The  base  intrusion 
detection  interval  TI0S  is  a  design  parameter  to  be  adjusted  to  maximize  MTTSF  while  minimizing  Ctotai. 

Pfn  &  Pfp:  Pfn  is  the  probability  of  false  negatives  and  Pfp  is  the  probability  of  false  positives  based  on  vote-based  IDS.  We 
parameterize  them  by: 
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Here  p  corresponds  to  pi  or  p2  for  false  negative  or  false  positive  probability  for  Pfn  or  Pfp.  In  region-voting-based  IDS, 
Ngood  =  mark(Tm)/Nregion,i  and  Nbad  =  mark(DCm)/Nregion,i-  Basically,  Pjn  is  calculated  by  the  number  of  compromised 
nodes  incorrectly  diagnosed  as  trusted  healthy  nodes  (i.e.,  detecting  a  bad  node  as  a  good  node)  over  the  number  of  detected 
nodes.  On  the  other  hand,  Pfp  is  calculated  by  the  number  of  normal  nodes  incorrectly  flagged  as  anomalies  over  the  number 
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Table  2 

Main  parameters  and  default  values. 


Parameter 

Value 

Parameter 

Value 

Parameter 

Value 

X 

1/(60*60) 

o 

1/(60*60*32) 

Trb 

5s 

n 

1/(60*60*4) 

BW 

1  Mbps 

Tlb 

2  s 

Tids 

5-1200 s 

Ninit 

150  nodes 

m 

3 

7 status 

300  s 

D(md)/A(mc) 

Linear 

R 

200  m 

xc 

1/(60*60*24) 

r 

500  m 

bvote 

100  bits 

Xq 

1/30 

bGDH 

64  bits 

bm-list 

100  bits 

pi 

2% 

bcc 

800  bits 

e 

0.5 

p2 

2% 

bs 

400  bits 

Fig.  4.  MTTSF  versus  TJDs  with  pi  =  p2  =  0.005. 

of  detected  normal  nodes.  We  consider  intrinsic  defect  of  host-based  IDS  in  each  node  as  well  as  collusion  of  compromised 
nodes  in  region-voting-based  IDS,  so  a  compromised  participant  can  cast  a  negative  vote  against  a  healthy  target  node  and 
can  cast  a  positive  vote  for  a  malicious  node. 

6.2.  Data  and  analysis 

We  vary  values  of  key  design  parameters  to  analyze  their  effects  on  optimal  settings  (in  terms  of  the  best  regional  size 
and  the  best  IDS  interval)  under  which  the  system  performance  is  optimized.  Table  2  summarizes  default  parameter  values. 
We  test  the  effects  of  key  parameters  such  as  per-node  false  negative  or  false  positive  probabilities  (pi,  p2),  the  group 
communication  rate  (Xq)  and  the  compromising  rate  (A.c).  Note  that  for  region-voting-based  IDS,  Pfn  and  Pfp  are  calculated 
based  on  Eq.  (15). 

Figs.  4  and  5  analyze  the  optimal  settings  in  terms  of  the  IDS  detection  interval  TiDS  and  the  number  of  regions  Nregi0n 
under  which  MTTSF  is  maximized.  Fig.  4  is  for  the  case  of  low  pi  and  p2  values  while  Fig.  5  is  for  the  case  of  high  pi  and  p2 
values.  The  special  case  in  which  there  is  only  one  region  (Nregion  =  1)  is  also  considered. 

From  Figs.  4  and  5,  we  observe  that  there  exists  an  optimal  Tids  that  maximizes  MTTSF.  In  general,  as  Tms  increases, 
MTTSF  increases  until  its  optimal  TIDS  is  reached,  and  then  MTTSF  decreases  past  the  optimal  TiDS.  The  reason  of  decreasing 
MTTSF  after  reaching  the  optimal  point  is  that  the  false  positive  probability  (P/p)  increases  as  TIDS  decreases,  therefore 
resulting  in  more  nodes  being  falsely  identified  as  compromised  and  being  evicted  from  the  system.  Here  we  note  that 
decreasing  T/DS  (thus  performing  IDS  more  often)  in  effect  increases  the  number  of  good  nodes  being  misdiagnosed  as  bad 
nodes  over  time,  thereby  decreasing  the  number  of  good  nodes  (Ng00 d)  and  adversely  affecting  the  false  positive  probability 
computed  by  Eq.  (15).  Next,  we  observe  that  there  exists  an  optimal  regional  area  size  which  is  largely  dictated  by  pi 
and  p2  values.  Note  that  Pfp  is  one  aspect  of  false  alarms  generated  by  IDS,  which  increases  if  IDS  is  more  frequently 
triggered.  When  pi  =  p2  is  sufficiently  low  (i.e.,  pi  =  p2  =  0.005)  as  shown  in  Fig.  4,  the  best  MTTSF  is  found  with 
Nregion  =  1  at  T/Ds  =  30  s.  When  pi  =  p 2  is  sufficiently  high  (i.e.,  pi  =  p2  =  0.02)  as  shown  in  Fig.  5,  the  best  MTTSF  is 
identified  with  Nregion  =  19  at  Tids  =  120  s.  The  optimal  MTTSF  exists  due  to  the  tradeoff  between  the  positive  and  adverse 
effects  of  performing  IDS. 

Specifically,  whenpl  =  p2is  sufficiently  low,  PfP  and  Pfn  are  also  sufficiently  low  due  to  the  positive  effect  of  high  quality 
IDS,  so  the  system  benefits  the  best  by  using  m  vote-participants  out  of  a  large  region,  i.e.,  atNregion  =  1.  On  the  other  hand, 
when  pi  =  p2  is  high,  PfP  and  Pfn  are  also  sufficiently  high  due  to  the  adverse  effect  of  IDS,  so  the  system  benefits  the  best 
by  using  m  vote-participants  out  of  a  moderately  large  region  at  Nregion  =  19  such  that  it  still  has  a  good  chance  of  finding 
m  vote-participants  in  the  region  without  suffering  too  much  from  high  PfP  and  Pfn- 
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Fig.  5.  MTTSF  versus  TIDS  with  pi  =  p2  =  0.02. 
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(b)  Ctotai  versus  Nregi0n  with  pi  =  p2  =  0.02. 
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Fig.  6.  Ctotai  with  respect  to  TiDS  and  Nregion  with  pi  =  p2  =  0.02. 

Lastly,  we  notice  that  as  pi  =  p2  increases,  the  optimal  T/DS  increases.  This  is  again  due  to  the  tradeoff  between  the 
positive  and  adverse  effects  of  IDS.  When  high  pi  =  p 2  is  used,  triggering  IDS  less  frequently  will  generate  less  false  alarms. 
On  the  other  hand,  when  sufficiently  low  pi  =  p2  is  used,  triggering  of  IDS  often  will  improve  MTTSF  with  the  high  quality 
IDS  without  generating  frequent  false  alarms.  We  notice  that  at  Nregion  =  61  MTTSF  is  low  and  flat  because  of  the  very  low 
probability  of  finding  m  vote-participants  in  a  small  region,  resulting  in  IDS  not  being  used  much  in  the  system. 

Next  we  analyze  the  optimal  settings  in  terms  of  Tids  and  Nregion  under  which  Ctotai  is  minimized.  Fig.  6(a)  shows  Ctotai 
versus  TiDS  with  varying  Nre gion  while  Fig.  6(b)  shows  Ctotai  versus  Nregi0n  with  varying  T/Ds-  We  observe  that  Ctotai  varies 
depending  on  Nregl0n  and  that  there  is  an  optimal  TfDs  under  which  Ctotai  is  maximized.  Moreover,  when  TfDs  is  sufficiently 
large,  say  T/DS  >  30  s,  the  optimal  Nreg ion  is  19.  When  TIDs  =  15  s  or  30  s,  the  optimal  Nregion  is  at  7.  The  reason  is  that  when 
Tids  is  sufficiently  small,  the  node  density  tends  to  decrease  rapidly  because  of  frequent  intrusion  detection  activities  to  evict 
compromised  nodes.  In  this  case,  the  system  tends  to  favor  a  small  number  of  regions  (represented  by  Nregion  =  7)  to  reduce 
the  inter-regional  overhead,  as  in  this  case  the  inter-regional  overhead  will  dominate  the  intra-regional  overhead  since  the 
intra-regional  overhead  will  be  relatively  small  when  the  node  density  is  low.  Lastly  when  Nregion  >  37 ,  Ctotai  increases  again 
because  the  inter-regional  communications  cost  outweighs  the  intra-regional  communication  cost.  We  also  observe  that  in 
Fig.  6(a)  region-voting-based  IDS  with  no  region  (i.e.,  Nregion  =  1)  at  its  optimal  settings  (T/DS  =  5,  Nreg ion  =  1)  performs  the 
best  due  to  a  significant  reduction  of  active  members  because  of  high  false  positives  generated  by  IDS. 

Next  we  analyze  the  effect  of  pi  and  p2  on  the  optimal  settings  that  maximize  MTTSF  and  minimize  Ctotai.  Figs.  7  and  8 
summarize  the  results.  Here  for  each  MTTSF  or  Ctotai  curve,  T/Ds  is  chosen  at  its  optimal  value  to  isolate  out  of  its  effect.  As 
shown  in  Fig.  7,  when  pi  =  p 2  is  sufficiently  low  (0.005  or  0.007  or  0.01),  MTTSF  is  the  best  at  Nregl0 n  =  1-  However,  when 
pi  =  p 2  becomes  higher,  say  pi  =  p 2  >  0.01,  we  observe  the  best  MTTSF  at  Nregion  =  19.  The  results  correlate  well  with 
the  results  presented  earlier  in  Figs.  4  and  5,  so  the  same  physical  interpretation  applies  for  the  tradeoff  between  positive 
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Fig.  7.  MTTSF  versus  p  1  and  p2  with  varying  Nresion. 


Fig.  8.  Cfotai  versus  pi  and  p2  with  varying  f\lregion. 


and  negative  effects  of  IDS.  In  Fig.  8,  we  observe  that  when  pi  =  p2  is  sufficiently  low,  Ctotai  is  minimized  at  Nregion  =  19. 
However,  when  pi  =  p 2  becomes  higher,  say  pi  =  p2  >  0.01,  we  observe  that  Ctotai  is  minimized  at  Nregion  =  7  or 
1.  Again  this  result  correlates  well  with  those  presented  in  Fig.  6.  We  note  that  there  exists  a  tradeoff  between  security 
versus  performance.  While  MTTSF  is  maximized  at  Nregion  =  19  when  pi  =  p2  is  high  (e.g.,  >0.01),  Ctotai  is  minimized  at 
iVregion  =  19  only  when  pi  =  p2  is  low  (e.g.,  <0.01).  Consequently,  the  system  designer  should  select  the  best  settings  under 
which  the  system  requirements  on  MTTSF  and  Ctotai  are  best  satisfied. 

Here  we  note  that  a  major  advantage  gained  from  combining  IDS  with  hierarchical  key  management  is  that  intrusion 
detection  can  directly  leverage  the  region-based  key  management  infrastructure  to  efficiently  implement  IDS  functionality 
with  the  region-voting-based  IDS  design.  The  case  in  which  intrusion  detection  and  key  management  are  separately 
managed  can  be  represented  by  the  special  case  in  which  there  is  no  concept  of  region-based  management,  i.e.,  Nregion  =  1 
with  the  flat  architecture.  We  see  from  Figs.  6-8  that  our  integrated  region-based  scheme  at  identifying  optimal  settings 
can  significantly  outperform  the  no-region  special  case  in  the  total  network  traffic  incurred  per  time  unit.  For  example,  in 
Fig.  6(a)  when  TlDS  >  60  s,  we  see  that  the  optimal  setting  is  at  Nregion  =  19  for  minimizing  the  overall  traffic  per  time  unit 
under  which  our  integrated  region-based  scheme  significantly  performs  better  the  special  case  in  which  separate  intrusion 
detection  and  key  management  are  being  employed  with  Nregi0n  =  1. 

Finally,  we  analyze  the  effect  of  the  group  communication  rate  (Xq)  and  compromising  rate  (Ac)  on  MTTSF  or  Ctota|. 
Table  3  summarizes  the  optimal  Nregi0n  value  under  which  MTTSF  is  maximized  and/or  Ctotai  is  minimized.  We  fix  all  other 
parameters  at  their  default  values.  We  observe  that  as  Xq  increases,  the  optimal  Nregl0n  value  increases,  while  the  resulting 
MTTSF  decreases.  The  reason  for  a  low  MTTSF  when  Xq  is  high  is  mostly  due  to  security  failure  in  Condition  Cl.  We  observe 
that  as  Xq  increases,  Ctota|  increases  because  of  the  increased  group  communication  cost. 

For  the  effect  of  Xc,  we  observe  that  as  Xc  increases,  the  optimal  Nregion  value  decreases  while  the  resulting  MTTSF 
decreases.  The  reason  is  that  with  a  high  Xc  there  are  more  compromised  nodes  in  the  system  and  the  system  will  benefit 
more  from  using  a  smaller  Nregion  to  increase  the  probability  of  being  able  to  find  m  vote-participants  in  order  to  more 
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Table  3 

Effect  of  A.,  and  Ac  on  optimal  NTegion,  MTTSF  and  Ctotai- 


Xq 

(Nregion.  MTTSF) 

(Nregion »  Qotal)  ^-c 

(Nregion,  MTTSF ) 

(Nregion 5  Qotal) 

Once  per  15  s 

Once  per  30  s 

Once  per  1  min 

Once  per  5  min 

Once  per  10  min 

(19, 92459) 

(19, 96909) 

(19, 103919) 

(1,  145390) 

(1,  177034) 

(1,216431) 
(1,  186933) 

(1, 172210) 
(19,  83969) 
(19, 64544) 

Once  an  hr 

Once  per  12  h 

Once  a  day 

Once  per  2  days 

Once  per  4  days 

(1,9618) 

(1,54515) 

(19,  96909) 

(19, 184938) 

(19, 359233) 

(19,855480) 

(7,  326509) 

(1,  186933) 

(1,  106724) 
(1,64329) 

Table  4 

Optimal  settings  of  region-based  group  key  management  integrated  with  voting-based  IDS. 

Optimal  settings  for  maximizing 
MTTSF 

Optimal  settings  for  minimizing  Ctotai 

Host-based  IDS  false  negative/false  positive 
probabilities  (pl/p2) 

High 

Small  regional  area  size 

Large  Tms 

Large  regional  area  size 
Large  T,DS 

Low 

Large  regional  area  size 

Small  TIDS 

Small  regional  area  size 
Small  Tms 

Group  communication  rate  (A?) 

High 

Small  regional  area  size 

Small  T,ds 

Large  regional  area  size 
Small  Tms 

Low 

Large  regional  area  size 

Large  Tms 

Small  regional  area  size 
Large  Tms 

Compromising  rate  (Ac) 

High 

Large  regional  area  size 

Small  TIDS 

Small  regional  area  size 
Small  Tms 

Low 

Small  regional  area  size 

Large  Tms 

Large  regional  area  size 
Large  Tms 

effectively  evict  compromised  nodes.  We  also  observe  that  when  Xc  increases,  the  optimal  Nregi0n  in  minimizing  Ctotai 
increases  while  the  resulting  Ctotai  increases  slightly.  The  reason  that  Ctotai  increases  slightly  as  Xc  increases  is  that  when 
there  are  more  compromised  nodes  in  a  group.  IDS  is  triggered  more  frequently,  thus  increasing  the  overall  cost.  The 
reason  that  the  optimal  Nregion  value  increases  when  Xc  increases  is  that  the  inter-regional  communication  overhead  in  C/DS,/ 
(disseminating  a  message  from  a  member  to  a  leader)  is  small  compared  with  the  intra-regional  communication  overhead 
(disseminating  a  message  from  each  leader  to  its  regional  members),  so  the  system  favors  a  large  Niegion  to  reduce  the 
intra-regional  overhead.  Here  again  we  see  that  a  tradeoff  exists  between  security  versus  performance.  The  system  designer 
should  select  the  bestNregi0n  such  that  both  the  security  requirement  (in  terms  of  MTTSF)  and  performance  requirement  (in 
terms  of  Ctotai)  can  be  best  satisfied. 

Table  4  summarizes  optimal  settings  of  our  proposed  integrated  hierarchical  group  key  management  protocol  integrated 
with  region-voting-based  IDS.  under  which  MTTSF  is  maximized  and/or  Ctota|  is  minimized.  This  provides  guidelines  for 
system  designers  to  fine-tune  the  regional  area  size  and  the  IDS  detection  interval  to  best  satisfy  application-imposed 
security  and  performance  requirements.  The  general  trend  is  that  when  (pi,  p2),  Xq  or  Xc  is  low,  low  MTTSF  and  high  Ctotai 
are  observed,  and  vice  versa.  For  MTTSF,  the  optimal  regional  area  size  and  the  optimal  intrusion  detection  interval  identified 
are  dictated  by  the  tradeoff  between  positive  effects  of  IDS  (e.g.,  removing  compromised  nodes  as  soon  as  possible  not  to  be 
vulnerable  to  them)  and  negative  effects  of  IDS  (e.g.,  false  positives  and  negatives  generated  by  triggering  IDS).  For  Ctotai,  the 
optimal  regional  area  size  is  primarily  determined  by  the  tradeoff  between  inter-regional  and  intra-regional  communication 
overheads  while  the  optimal  intrusion  detection  interval  is  dictated  by  the  tradeoff  between  IDS  related  communication  cost 
and  group  communication  cost. 


6.3.  Comparative  analysis 

In  this  section,  we  perform  a  comparative  analysis  of  regional-based  integrated  management  against  two  baseline 
schemes:  (a)  a  separate  management  scheme  in  which  intrusion  detection  and  group  key  management  are  being  employed 
separately,  with  m  =  3  nodes  being  selected  for  performing  voting-based  IDS  functions,  and  (b)  a  separate  management 
scheme  in  which  intrusion  detection  and  key  management  are  being  employed  separately,  with  only  one  node  being  selected 
for  IDS  functions,  i.e.,  without  voting-based  IDS. 

Figs.  9  and  10  compare  MTTSF  and  Ctotai  obtained  for  these  three  schemes  as  a  function  of  the  host-based  IDS  false 
negative/false  positive  probabilities  (pl/p2),  with  T/DS  fixed  at  its  optimal  value  under  which  the  MTTSF  of  the  integrated 
management  scheme  is  maximized.  As  shown  in  Fig.  9,  region-based  integrated  management  has  the  best  MTTSF  among  all. 
Furthermore,  the  difference  in  MTTSF  is  more  significant  as  the  host-based  IDS  false  probability  increases.  On  the  other  hand, 
separate  management  without  voting-based  IDS  performs  significantly  worse  than  others  due  to  the  fact  that  it  relies  on 
only  a  single  evaluator  to  make  diagnosis  decisions.  From  Fig.  1 0,  we  observe  that  the  MTTSF  advantage  gained  by  integrated 
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Fig.  9.  MTTSF  of  integrated  management  versus  separate  management  schemes. 
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Fig.  10.  Qotai  of  integrated  management  versus  separate  management  schemes. 


management  over  separate  management  does  not  come  with  the  penalty  of  increased  control  message  traffic,  because 
integrated  management  can  directly  leverage  the  region-based  hierarchical  infrastructure  to  efficiently  implement  both 
key  management  and  IDS  functions.  Together  with  Figs.  9  and  10,  we  conclude  that  integrated  management  yields  higher 
MTTSF  while  producing  lower  control  message  traffic  compared  with  separate  deployment  of  key  management  and  IDS 
techniques. 


7.  Conclusion 

In  this  paper,  we  proposed  and  analyzed  region-based  hierarchical  group  key  management  integrated  with  voting-based 
IDS  to  deal  with  both  outsider  and  insider  security  attacks  for  a  GCS  in  MANETs  for  efficiency,  scalability  and  survivability.  Our 
results  showed  that  there  exist  optimal  settings  in  terms  of  the  optimal  regional  area  size  and  IDS  intrusion  detection  interval 
to  maximize  the  mean  time  to  security  failure  while  minimizing  the  total  communication  cost  of  the  GCS.  Furthermore,  we 
showed  that  our  proposed  region-based  integrated  scheme  outperforms  existing  schemes  for  which  intrusion  detection  and 
group  key  management  are  being  employed  separately.  The  tradeoff  between  security  and  performance  can  be  summarized 
by  a  system  designer  into  a  lookup  table  listing  optimal  settings.  Then  at  runtime  for  a  given  set  of  parameter  values 
characterizing  the  operational  and  environmental  conditions  of  MANETs  observed  dynamically,  the  GCS  can  perform  a  table 
lookup  operation  to  select  optimal  settings  for  maximizing  the  mean  time  to  security  failure  (representing  the  lifetime  of  the 
GCS)  while  minimizing  the  total  communication  cost. 
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In  the  future,  we  plan  to  investigate  if  trust-based  IDS  techniques  can  be  applied  to  better  cope  with  collusion  of 
compromised  nodes.  Also  in  this  paper,  we  considered  the  use  of  PKI  with  preloaded  public  keys  of  all  group  members 
for  source  authenticity  in  the  bootstrapping  period.  As  future  work,  we  plan  to  investigate  the  use  of  dynamic  public  key 
management  and  trust-based  management  for  source  authenticity. 
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